A Fuzzy matching duplicate check uses fuzzy logic to find duplicates. For a selected record, it compares the values of several fields with the values of the same fields in other records.

When duplicate values are found in another record:

  1. The duplicate score is calculated based on the field weightage as defined for the duplicate check.
  2. The calculated duplicate score is compared with the threshold as defined for the duplicate check.
  3. If the duplicate score is equal to or higher than the threshold, the record is reported as a possible duplicate.

Example:

Duplicate check on CustTable

Threshold: 50%

Table name            Datasource name       Field                Field label       Weightage
CustTable             CustTable             AccountNum           Customer account
CustCustomerV3Entity  CustCustomerV3Entity  AddressStreet        Street            1
CustCustomerV3Entity  CustCustomerV3Entity  AddressZipCode       ZIP/postal code   1
CustCustomerV3Entity  CustCustomerV3Entity  OrganizationName     Organization name 6
CustCustomerV3Entity  CustCustomerV3Entity  PrimaryContactEmail  Primary email     3
CustCustomerV3Entity  CustCustomerV3Entity  PrimaryContactPhone  Primary phone     3

Calculation examples:

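  • Duplicate values exist in the Primary email field and in the Primary phone field. The duplicate score is the combined weightage of these fields (3 + 3 = 6) divided by the total weightage of all fields (14): 6 / 14 * 100 = 42.86%. This is below the 50% threshold, so the record is not reported as a possible duplicate.
  • Duplicate values exist in the Organization name field and the ZIP/postal code field. The duplicate score is: (6 + 1) / 14 * 100 = 50%. This is equal to the threshold, so the record is reported as a possible duplicate.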
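
The calculation examples above can be reproduced with a short script. This is only an illustrative sketch, not the product's implementation: the function names are hypothetical, and it assumes the set of fields with duplicate values has already been determined (the fuzzy comparison itself is not modeled).

```python
# Field weightages from the CustTable example.
# AccountNum is left out because the example lists no weightage for it.
WEIGHTS = {
    "AddressStreet": 1,
    "AddressZipCode": 1,
    "OrganizationName": 6,
    "PrimaryContactEmail": 3,
    "PrimaryContactPhone": 3,
}
THRESHOLD = 50.0  # threshold of the example duplicate check, in percent


def duplicate_score(matching_fields, weights=WEIGHTS):
    """Return the duplicate score (in percent) for the fields with duplicate values."""
    total = sum(weights.values())  # 14 in the example
    matched = sum(weights[field] for field in matching_fields)
    return matched / total * 100


def is_possible_duplicate(matching_fields, weights=WEIGHTS, threshold=THRESHOLD):
    """A record is reported when its score is equal to or higher than the threshold."""
    return duplicate_score(matching_fields, weights) >= threshold


email_phone = ["PrimaryContactEmail", "PrimaryContactPhone"]
name_zip = ["OrganizationName", "AddressZipCode"]

print(round(duplicate_score(email_phone), 2), is_possible_duplicate(email_phone))
print(round(duplicate_score(name_zip), 2), is_possible_duplicate(name_zip))
```

Because the comparison is "equal to or higher than", a score of exactly 50 against a 50% threshold is reported.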

Review and merge duplicates

You can review the duplicates that are found.

To resolve duplicates, you can:

  • Merge field values from the duplicate records to a chosen master record.
  • Manually delete undesired duplicate records.


Standard procedure

1. Go to the form from which you want to check whether duplicates exist for a record.
2. In the list, find and select the desired record.
3. Start the fuzzy duplicate check.
  Only one duplicate check is done. This is the first active Fuzzy matching duplicate check found that:

  • Applies to the main table of the form.
  • Is used in a duplicate check rule of an active data quality policy.

  Click Check for duplicates.
 

Note:

On the applicable form, on the Action Pane, depending on the setup, the Check for duplicates button can be shown:

  • On the 'Data quality' tab, in the 'Duplicate check' button group.
  • As a separate button.
  • On an existing action pane tab, in the 'Duplicate check' button group.

4. If possible duplicates are found, the Duplicate records found page is shown automatically.
  Review the possible duplicates that are found by the fuzzy duplicate check.
5. Sub-task: Merge field values of duplicate records.
  5.1 You can merge field values of the found duplicate records to a chosen master record.
  In the Duplicates found section, click Proceed to merge.
  5.2 The master record to which the field values are merged is shown in the Master record section. By default, the originally selected record is the master record. You can select another of the shown duplicate records as the master record. As a result, that record is shown in the Master record section.
  For the desired record, select the Master record check box.
  5.3 The duplicate check setup defines which fields can be merged. These fields are shown in the Duplicates found section, preceded by a 'merge' check box. You can select only one 'merge' check box for each field.
  For each field value that you want to merge to the master record, select the 'merge' check box.
  5.4 Click Merge.
  5.5 Click Yes.
6. Sub-task: Delete undesired duplicate records.
  6.1 When you merge field values to a master record, the found duplicate records are not deleted automatically. The main reason is that the found duplicate records can be referenced in other records. Therefore, if desired, delete the undesired duplicate records manually.
  In the Duplicates found section, in the list, find and select a duplicate record that you want to delete.
  6.2 Click the link in the Identifier field.
  6.3 Click Delete.
 

Note: Before you delete a record, make sure it is not referenced in another record.

  6.4 Click Yes.
  6.5 Close the page.
7. Close the page.
8. Close the page.
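
The effect of the merge sub-task can be pictured with a small sketch. It is a simplified model under stated assumptions, not the product's logic: records are plain dictionaries, the function name and sample values are hypothetical, and the selected 'merge' check boxes are represented as a mapping from field name to the duplicate record that supplies the value.

```python
def merge_into_master(master, duplicates, selections):
    """Return a copy of the master record with the selected field values merged in.

    selections maps a field name to the index (in duplicates) of the record
    whose value is taken. A mapping holds one entry per field, which mirrors
    the rule that only one 'merge' check box can be selected for each field.
    """
    merged = dict(master)  # leave the original master record untouched
    for field, source_index in selections.items():
        merged[field] = duplicates[source_index][field]
    return merged


# Hypothetical sample data for illustration only.
master = {"AccountNum": "C001", "OrganizationName": "Contoso", "PrimaryContactEmail": ""}
duplicates = [
    {"AccountNum": "C002", "OrganizationName": "Contoso Ltd", "PrimaryContactEmail": "info@contoso.example"},
]

merged = merge_into_master(master, duplicates, {"PrimaryContactEmail": 0})
print(merged)
print(len(duplicates))  # the duplicate records still exist after the merge
```

In this model, as in the procedure, merging only copies field values to the master record. Removing the duplicate records is a separate, manual step.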

Notes

If you run a quality assessment, duplicate checks of type 'Fuzzy matching' are run as well. If a duplicate record is found, a warning is shown for the record in the Quality assessment results. The message shows the number of duplicate records found.
